Inferring Incidence of Unreported SARS-CoV-2 Infections Using Seroprevalence of Open Reading Frame 8 Antigen, Hong Kong

We tested seroprevalence of open reading frame 8 antigens to infer the number of unrecognized SARS-CoV-2 Omicron infections in Hong Kong during 2022. We estimate 33.6% of the population was infected, 72.1% asymptomatically. Surveillance and control activities during large-scale outbreaks should account for potentially substantial undercounts.

1 Laboratory methods

Specimen collection
Specimens of clotted blood were collected from the patients. The serum was separated from the blood by centrifugation and stored at -80°C until use. Approval for the study was obtained from the Joint Chinese University of Hong Kong-New Territories East Cluster Clinical Research Ethics Committee (ref no.: 2020.229).

Protein expression and purification
The full-length ORF8 protein of SARS-CoV-2 was fused with a C-terminal His6 tag and cloned into a customized pCAGEN vector. The ORF8 protein was expressed by transfecting purified plasmid into Expi293 cells using the Gibco ExpiFectamine 293 Transfection Kit. The cells were incubated in a shaker at 37°C with 8% CO2 for 6 days. The ORF8 protein in the supernatant was then purified by Ni-NTA and concentrated using a 3K Protein concentrator.
It has been shown that SARS-CoV-2 contains a unique ORF8 accessory gene that is absent from other known human pathogenic coronaviruses (1). In our previous study (2), we found that ORF8, as a non-structural protein, was only expressed in COVID-19 patients, and that it was highly immunogenic compared with other ORF and structural proteins. Thus, seropositivity to ORF8 antigen provides evidence of SARS-CoV-2 infection, and this approach has been used by us and others (3,4).

ELISA binding assay
The 96-well enzyme-linked immunosorbent assay (ELISA) plates (Nunc MaxiSorp, Thermo Fisher Scientific) were first coated overnight with 100 ng per well of purified recombinant protein in PBS buffer. The coated plates were then blocked with 100 μL of Chonblock Blocking/Sample Dilution ELISA Buffer (Chondrex, Inc, USA) at room temperature for 2 hours. Each serum or plasma sample was tested at a dilution of 1:100 in Chonblock Blocking/Sample Dilution ELISA Buffer, and 100 μL of diluted sample was added to the ELISA wells of each plate for a 2-hour incubation at 37°C. After extensive washing with PBS containing 0.1% Tween 20, HRP-conjugated goat anti-human IgG (1:5000, GE Healthcare) was added for 1 hour at 37°C. The ELISA plates were then washed five times with PBS containing 0.1% Tween 20. Subsequently, 100 μL of HRP substrate (Ncm TMB One) (New Cell & Molecular Biotech Co. Ltd, China) was added to each well. After 15 minutes of incubation, the reaction was stopped by adding 50 μL of 2 M H2SO4 solution, and the plates were read on an absorbance microplate reader at a wavelength of 450 nm.
The assay was initially validated using 100 negative controls and 100 convalescent sera from adults in Hong Kong who had recovered from Omicron infection one month prior. We defined a serum as positive for ORF8 antigen if its OD value exceeded the mean of the negative controls by 3 standard deviations (SD), a threshold that in our assay was 0.28.
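The positivity rule above (mean of the negative controls plus 3 SD) can be sketched as follows. This is a minimal illustration, not the authors' code: the OD readings are hypothetical, the function names are our own, and the use of the sample (n - 1) standard deviation is an assumption, because the source does not say which estimator was used.

```python
from statistics import mean, stdev


def orf8_cutoff(negative_ods, n_sd=3.0):
    """Cutoff = mean + n_sd * SD of the negative-control OD450 values.

    Assumption: the sample (n - 1) standard deviation is used.
    """
    return mean(negative_ods) + n_sd * stdev(negative_ods)


def is_positive(od, cutoff):
    """A serum is called positive when its OD450 exceeds the cutoff."""
    return od > cutoff


# Hypothetical OD450 readings for a handful of negative controls
# (the study itself used 100 negative controls).
negatives = [0.10, 0.12, 0.08, 0.15, 0.11, 0.09, 0.13, 0.10]
cutoff = orf8_cutoff(negatives)

# Classify two hypothetical test sera.
calls = [is_positive(od, cutoff) for od in (0.15, 0.85)]
```

With real data, `negatives` would hold the OD values of all negative controls, and the resulting cutoff would be compared against each test serum read on the same plate reader.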

2 Statistical modelling

2.1 Model specification
We consider that an individual who was infected by SARS-CoV-2 during the period from January 1 to June 20, 2022, can be classified into one of six mutually exclusive and exhaustive types according to detection and reporting status. These types are as follows.

Detected and reported
• Type (#1): correctly detected by PCR test with test sensitivity $s_{\mathrm{PCR}}$ and 100% reported,
• Type (#2): correctly detected by rapid antigen test (RAT) with test sensitivity $s_{\mathrm{RAT}}$ and reported with self-reporting ratio $r_{\mathrm{RAT}}$,

Detected and unreported
• Type (#3): correctly detected by RAT with test sensitivity $s_{\mathrm{RAT}}$ and unreported with ratio $1 - r_{\mathrm{RAT}}$,
In each class above, the distribution of the component types can be determined given the PCR test sensitivity $s_{\mathrm{PCR}}$, the RAT sensitivity $s_{\mathrm{RAT}}$, and the RAT self-reporting ratio $r_{\mathrm{RAT}}$. Based on the existing literature, we set $s_{\mathrm{PCR}}$ at 99.0% and $s_{\mathrm{RAT}}$ at 81.0%. For the specificity, considering that both the RT-PCR test and RAT have a high level of specificity (> 99.9%) for detecting SARS-CoV-2 infection, we set specificity to 100%, which is unlikely to change the modelling results but substantially reduces model complexity.
Different values of the RAT sensitivity $s_{\mathrm{RAT}}$ could be considered in further analysis.
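To make the roles of the two sensitivities and the self-reporting ratio concrete, the sketch below computes the expected number of infections falling into Types #1 to #3. It is an illustration only: the function name, the numbers of infected individuals taking each test (`n_pcr`, `n_rat`), and the self-reporting ratio of 0.5 are hypothetical, and the remaining types are not covered.

```python
def expected_type_counts(n_pcr, n_rat, se_pcr=0.99, se_rat=0.81, r_rat=0.5):
    """Expected counts of Types #1-#3 among infected individuals who were tested.

    n_pcr, n_rat : hypothetical numbers of infected individuals taking a PCR
                   test or a RAT, respectively
    se_pcr, se_rat : test sensitivities (99.0% and 81.0% in the text)
    r_rat : RAT self-reporting ratio (placeholder value, not a study estimate)
    """
    return {
        "type1_pcr_reported": n_pcr * se_pcr,
        "type2_rat_reported": n_rat * se_rat * r_rat,
        "type3_rat_unreported": n_rat * se_rat * (1.0 - r_rat),
    }


# Hypothetical example: 1,000 infected individuals take each kind of test.
counts = expected_type_counts(n_pcr=1000, n_rat=1000)
```

Types #2 and #3 always sum to the number of RAT-detected infections, so the self-reporting ratio only controls how that total is split between the reported and unreported classes.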

Likelihood framework
To estimate the daily number of SARS-CoV-2 infections, $c_{\mathrm{all}}(t)$, and the parameters $p_1$, $p_2$ and the RAT self-reporting ratio $r_{\mathrm{RAT}}$, we construct the following two likelihood functions. The likelihood functions were defined on a daily basis and can thus be updated daily, so that the depletion of the susceptible population was accounted for.
Here, $N$ denoted the fixed population size, which was 7.5 million in Hong Kong; $\mathrm{IAR}(t)$ denoted the infection attack rate, ranging from 0 to 1; and $s_{\mathrm{TEST}}$ denoted the sensitivity of the laboratory test (i.e., the ORF8 test) used for samples collected in this study, which was fixed at 75%. Among 100 negative controls (collected from an external source, data not shown) who were free of SARS-CoV-2 infection and were used to check the specificity of the ORF8 test, we detected no false positives, and thus we set the specificity of the ORF8 test in this model to 100%.
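As a rough illustration of how a serosurvey can enter a hypergeometric likelihood, the sketch below scores candidate numbers of unreported infections against an observed count of ORF8-positive samples. It is not the likelihood used in this study. The sampling-frame size, the observed positive count, and all function names are hypothetical, and imperfect sensitivity is handled by a simplifying assumption: the expected number of individuals who would test positive (sensitivity × unreported infections, rounded) is treated as the number of "successes" in the population. Only the sample size of 1028 and the 75% sensitivity come from the text.

```python
from math import exp, inf, lgamma


def log_comb(n, k):
    """Log of the binomial coefficient C(n, k), via log-gamma."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def hypergeom_logpmf(x, M, K, n):
    """Log P(X = x) for x successes in n draws without replacement
    from a population of size M containing K successes."""
    if x < max(0, n - (M - K)) or x > min(n, K):
        return -inf
    return log_comb(K, x) + log_comb(M - K, n - x) - log_comb(M, n)


def serosurvey_loglik(unreported_infected, x_pos, n_sampled, frame_size,
                      se_test=0.75):
    """Log-likelihood of observing x_pos positives among n_sampled people.

    Assumption: specificity is 100%, and the number of people who would
    test positive is approximated by round(se_test * unreported_infected).
    """
    detectable = round(se_test * unreported_infected)
    return hypergeom_logpmf(x_pos, frame_size, detectable, n_sampled)


# Hypothetical inputs: a frame of 6 million never-reported residents and
# 206 ORF8-positive results among the 1028 sampled individuals.
grid = range(500_000, 3_000_001, 100_000)
best = max(grid, key=lambda u: serosurvey_loglik(u, 206, 1028, 6_000_000))
```

A grid search is used here only to keep the example short; in a Bayesian fit the same log-likelihood would be added to the log-likelihood of the reported case counts and explored by MCMC.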

Parameter estimation
The parameters to be estimated were the daily number of SARS-CoV-2 infections, $c_{\mathrm{all}}(t)$, together with $p_1$, $p_2$ and the RAT self-reporting ratio $r_{\mathrm{RAT}}$. We combined the two likelihood functions, i.e., the multinomial distribution for cases reported by PCR test or RAT and the hypergeometric distribution for the samples from those who were unaware of their testing status, so that all parameters could be estimated simultaneously. For parameter estimation, we adopted a Bayesian fitting procedure using the Metropolis-Hastings algorithm, a Markov chain Monte Carlo (MCMC) method, with non-informative prior distributions. The MCMC procedure was run with 10 chains of 100,000 iterations each, including 40,000 iterations for the burn-in period, to obtain the posterior estimates. The convergence of each MCMC chain was checked visually using trace plots and quantitatively using the Gelman-Rubin-Brooks diagnostic (5). The median and 95% credible intervals (95% CrI) of the posterior distributions of the model parameters were calculated as summaries.
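The fitting workflow described above (random-walk Metropolis-Hastings, several chains, burn-in, a Gelman-Rubin check, then posterior median and 95% CrI) can be sketched on a toy problem. The target below is a single proportion with a flat prior and hypothetical binomial data (206 positives out of 1028), not the combined multinomial and hypergeometric likelihood of the study. The chain count and lengths are shortened for speed, and the step size and starting values are arbitrary choices.

```python
import math
import random
from statistics import mean, median, variance


def log_post(p, x=206, n=1028):
    """Log posterior of a proportion p: flat prior on (0, 1), binomial data."""
    if not 0.0 < p < 1.0:
        return -math.inf
    return x * math.log(p) + (n - x) * math.log(1.0 - p)


def run_chain(n_iter, burn_in, start, step=0.02, seed=0):
    """Random-walk Metropolis-Hastings; returns the post-burn-in draws."""
    rng = random.Random(seed)
    current, current_lp = start, log_post(start)
    draws = []
    for i in range(n_iter):
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_post(proposal)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, proposal_lp - current_lp)):
            current, current_lp = proposal, proposal_lp
        if i >= burn_in:
            draws.append(current)
    return draws


def gelman_rubin(chains):
    """Potential scale reduction factor for equal-length chains."""
    n = len(chains[0])
    within = mean([variance(c) for c in chains])
    between = n * variance([mean(c) for c in chains])
    var_hat = (n - 1) / n * within + between / n
    return math.sqrt(var_hat / within)


# Four chains from dispersed starting points (the study used 10 chains of
# 100,000 iterations with 40,000 burn-in iterations).
starts = (0.05, 0.2, 0.5, 0.9)
chains = [run_chain(5000, 1000, start=s, seed=k) for k, s in enumerate(starts)]

pooled = sorted(d for c in chains for d in c)
post_median = median(pooled)
cri_95 = (pooled[int(0.025 * len(pooled))], pooled[int(0.975 * len(pooled))])
r_hat = gelman_rubin(chains)
```

Values of `r_hat` close to 1 indicate that the chains have mixed; trace plots of each chain would normally be inspected alongside this diagnostic.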

Extended discussion on limitations
This study has limitations.
First, among the 1028 self-claimed uninfected individuals recruited in this study, the proportion of females (63.9%) was higher than that in the Hong Kong population (54.3% in 2022, according to the Census and Statistics Department). This overrepresentation is likely because females are generally more willing to participate in health-related surveys. However, according to the literature on the COVID-19 epidemic in Hong Kong (6,7), an association between sex and risk of SARS-CoV-2 infection is unlikely. Therefore, the overrepresentation of females among our 1028 samples is unlikely to bias our estimate of the overall infection attack rate in the Hong Kong population, because both sexes are likely to have the same infection attack rate (and the same risk of infection).
Second, the initial infection attack rate IAR(0) of 0.2% was based on the real-world COVID-19 epidemic situation in Hong Kong from January 2020 to December 2021, under the previous "zero-COVID" policy. We noted that before 2022, cases in Hong Kong were reported under intensive case detection, contact tracing and disease control, so a large number of cases was unlikely to have occurred. Although it is difficult to obtain the exact value of the initial infection attack rate, slight changes in this setting are unlikely to affect our main findings. In addition, we neglected re-infection among this 0.2% of the population, which simplified the analysis and had a minor impact on the IAR estimates.